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A functional expression is introduced that relates scattered 
X-ray intensities from a still or a rotation snapshot to the 
corresponding structure -factor amplitudes. The new approach 
was implemented in the program nXDS for processing 
monochromatic diffraction images recorded by a multi- 
segment detector where each exposure could come from a 
different crystal. For images containing indexable spots, the 
intensities of the expected reflections and their variances are 
obtained by profile fitting after mapping the contributing pixel 
contents to the Ewald sphere. The varying intensity decline 
owing to the angular distance of the reflection from the 
surface of the Ewald sphere is estimated using a Gaussian 
rocking curve. This decline is dubbed 'Ewald offset correc- 
tion', which is well defined even for still images. Together with 
an image-scaling factor and other corrections, an explicit 
expression is defined that predicts each recorded intensity 
from its corresponding structure-factor amplitude. All diffrac- 
tion parameters, scaling and correction factors are improved 
by post-refinement. The ambiguous case of a lower point 
group than the lattice symmetry is resolved by a method 
reminiscent of the technique of 'selective breeding'. It selects 
the indexing alternative for each image that yields, on average, 
the highest correlation with intensities from all other images. 
Processing a test set of rotation images by XDS and treating 
the same images by nXDS as snapshots of crystals in random 
orientations yields data of comparable quality, clearly 
indicating an anomalous signal from Se atoms. 
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1. Introduction 

The availabihty of free-electron lasers (FELs) as a source of 
ultrabright X-ray pulses of femtosecond duration has provided 
a new approach for collecting diffraction data that makes 
extremely small and radiation-sensitive samples accessible to 
structural studies. It is expected that this new X-ray source will 
strongly contribute to our knowledge of the large group of 
biological objects such as membrane proteins that are difficult 
to grow as macroscopic crystals. 

Destruction of the irradiated object by a single pulse is 
much slower than the pulse duration, so that data can be 
collected at room temperature before the sample deteriorates 
owing to radiation damage. As a consequence many diffrac- 
tion snapshots are required, each from a different sample in a 
random orientation, to yield a complete data set. For crys- 
talline samples diffraction data are partially recorded on still 
images, which presents a challenging task to processing that 
traditional crystallographic programs were not designed to 
handle. 

This has led to the development of new software suites: 
(i) CrystFEL (White et ai, 2012, 2013), which uses MOSFLM 
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(Leslie & Powell, 2007; Powell et al, 2013) or DirAx 
(Duisenberg, 1992) for reflection indexing and merges indi- 
vidual intensity measurements (Kirian et al, 2010, 2011) to 
estimate the integrated intensity of each unique reflection, and 
(ii) the cctbx.xfel (Sauter et al, 2013; Hattne et al, 2014) and 
cctbx.spotfinder (Zhang et al, 2006) components within 
the Computational Crystallography Toolbox (cctbx). These 
programs rely on the Monte Carlo method for estimating 
reflection intensities, thereby assuming that other factors such 
as fluctuations in the incident-beam intensity, wavelength and 
spectrum as well as the irradiated volume of the specimen 
'integrate out' upon averaging (White et at, 2012; Kirian et al, 
2011). It has recently been demonstrated (Barends et al, 2014) 
that a de novo crystal structure can be determined from X-ray 
free-electron laser data analyzed by CrystFEL. The data 
comprise about 60 000 indexed images out of 2.4 million 
snapshots from a lysozyme heavy-atom derivative that gives a 
strong anomalous signal from two Gd atoms per asymmetric 
unit (Barends et al, 2014). 

Another complication that is not encountered in traditional 
data collection by the rotation method is the ambiguity in the 
choice of unit-cell basis vectors in cases where the lattice 
symmetry is higher than the crystal symmetry. The problem 
arises because the reflections recorded by each snapshot are 
indexed independently and their partial intensities are too 
inaccurate for any decision-making based on correlations 
involving different exposures (White et al, 2012). To circum- 
vent this problem, CrystFEL generates a perfectly twinned 
data set of higher symmetry by merging all snapshots, which 
renders the subsequent structure solution more difficult. 
Recently, new methods for breaking the indexing ambiguity 
have been described (Brehm & Diederichs, 2014) and the 
detour via artificially twinned data sets appears to no longer 
be necessary. 

Some of the problems in processing FEL snapshots had 
already been encountered more than 35 years ago when trying 
to solve the first virus crystal structures. Often, on account of 
radiation damage, the crystal had to be replaced after a single 
exposure covering only a small rotation range. This resulted in 
many partially recorded reflections and their complete inten- 
sities had to be estimated. A solution to this problem, the 
'post-refinement' method, was developed (Schutt & Winkler, 
1977; Rossmann et al, 1979; Harrison et al, 1985; Rossmann, 
1985) to derive complete intensities from refined estimates for 
the fractions of observed intensity, the 'partiality'. For rotation 
images the partiality of each reflection can always be calcu- 
lated as a function of orientation, the unit-cell metric, the 
mosaic spread of the crystal and a model of the reflection 
profile. The idea of the 'post-refinement' method is to improve 
these parameters after all images have been processed by 
using symmetry-related, fully recorded reflections for refer- 
ence. 

As an alternative to the processing of snapshots by Monte 
Carlo integration, the software package nXDS has been 
developed that resumes the old 'post-refinement' idea, now 
enriched by a model for the recorded snapshot intensities as 
a function of their corresponding structure-factor amplitudes 



and also including the correct treatment of stills. The unique 
reflection intensities and diffraction parameters for all crystals 
are refined simultaneously to minimize the discrepancy 
between observed and expected spot centroids and recorded 
intensities. The indexing ambiguity is resolved by a method 
reminiscent of 'selective breeding'. It selects the indexing 
alternative for each image that yields, on average, the highest 
correlation with symmetry-related reflection intensities from 
aU other images. The implementation of nXDS is derived from 
the standard XDS package (Kabsch, 2010a,Z>) in its present 
version, which includes the handling of multi-segment detec- 
tors. 

2. Description of the diffraction experiment 

Any right-handed orthonormal laboratory coordinate system 
{li, I2, 13) can be chosen with its origin at the intersection point 
of the beam and the crystal. This reference system serves to 
specify the following. 

(i) The monochromatic incident-beam wavevector Sq 
(wavelength A,, |So| - l/A.) pointing from the X-ray source 
towards the crystal. 

(ii) The fixed rotation axis m2 passing through the origin of 
the laboratory system. During X-ray exposure the crystal is 
uniformly rotated by some angle around this axis. Still 
images are treated as a limiting case where A^ = 0 and the 
concept of a rotation axis loses its meaning. 

(iii) The right-handed set of vmit-ceU basis vectors {bi, \>2, 
bs} of the single crystal and the associated reciprocal basis 
{bjjbj.b*} such that any reciprocal-lattice point can be 
expressed as po — /zbj + k\32 + where h, k, I are integers. 

(iv) The right-handed orthonormal detector system {Di, D2, 
D3) imagined to be fixed in the instrument and translated by 
the origin vector Dq. 

The detector consists of one or several rectangular, planar 
X-ray-sensitive segments. Their position and orientation is 
specified with respect to the detector system, which renders the 
specification independent of any detector movements. For a 
well calibrated instrument this greatly reduces the number of 
parameters that need to be refined during data processing. 

A right-handed orthonormal segment system |dj, A'2, ds) and 
an origin vector do are specified for each segment with respect 
to the detector coordinate system. The X-ray-sensitive area 
consists of identical pixels of sizes Qx, Qy (mm) along di and 
d2, respectively. 

Expressing the segment origin &o in terms of {di, d2, da}, the 
location x' of a segment pixel x, y with respect to the detector 
system is given by 

x' = xQ;fdi+yQyd^ + d;, 

= (x-4)Q;,d; + (y-y;)Gyd^+F'd^. (1) 

Thus, the laboratory coordinates x of the same pixel x, y are 

x = Do + (Di|D2|D3)x' 
= (x-Xo)Qxdi + 0'-3'o)Gyd2 + Fd3, (2) 

where 



Acta Cryst (2014). D70, 2204-2216 



Kabsch • nXDS 2205 



research papers 



d„ = (DilD^IDsK, (v = l,2,3) (3) 

and 

xo = - Do ■ AJQx, 
yo = Jo - Do ■ dz/Gy, 
F^F' + H.-A,. (4) 

The segment plane is found at a distance \F\ from the crystal. F 
has a negative sign if points towards the crystal. 

For accurate integration, the spot shape and extent are 
modelled as Gaussian distributions involving two parameters: 
the standard deviations of the reflecting range (mosaicity), CTm, 
and the combined effects of beam divergence and mosaicity, 

3. Processing steps of nXDS 

The program package nXDS was developed for automatic 

determination of scaled and fully corrected reflection inten- 
sities from monochromatic X-ray diffraction images, each 
from a different crystal in a random orientation. nXDS uses 
many ideas, routines and the overall structure of the rotation 
data-processing program XDS (Kabsch, 2Q\Qa,b) and includes 
numerous new routines for the efficient evaluation of a large 
number of snapshots. The algorithms developed for integra- 
tion, scaling and post-refinement are described in the 
following. 

The snapshots are processed by nXDS in seven steps, 
named XYCORR, INIT, COLSPOT, POWDER, IDXREF, 
INTEGRATE and CORRECT, which are called in succession 
by nXDS. Information is communicated between the steps 
by files, which allows the repetition of selected steps with a 
different set of input parameters without rerunning the whole 
program. 

3.1. XYCORR 

If necessary for the detector being used, this step computes 
lookup tables of spatial corrections for each detector pixel. In 
subsequent data-processing steps, when the true coordinates 
of a pixel with respect to the laboratory coordinate system 
are needed, the correction values for the X and Y coordinates 
are retrieved from the tables and added to the pixel's array 
coordinates in the data image. This step is essentially the same 
as that used by XDS. 

3.2. INIT 

Three lookup tables are determined here that are required 
by the subsequent processing steps for classifying pixels in 
the data images as untrusted, background or belonging to a 
diffraction spot. This step is virtually the same as that used by 
XDS. 

3.3. COLSPOT 

Strong diffraction spots occurring in the data images are 
located and their centroids are saved in a file. Compared with 
the corresponding step in XDS a simplification resulted for 



snapshots from the fact that there are no neighbouring images 
with contributions to the same spot. 

3.4. POWDER 

The origin of the detector system can often be found from 
a powder pattern generated from the spots located in the 
previous step. An incorrectly specified origin could lead to a 
misindexing of reflections, a risk that is particularly high for 
processing snapshots. As the use of multi-segment detectors 
can prevent direct recognition of the powder circles, the 
following method was devised. 

A plane with the incident-beam vector So as its normal is 
constructed from a second vector t = (1, 1, 1). To assure that t 
is non-collinear with Sq, one of its components is reset to 0, 
namely the component corresponding to the maximum abso- 
lute value of the components of Sq. If this is not unique, the 
first occurrence of the maximum is taken. For example, t = 
(1, 0, 1) if the y coordinate is the first occurrence of the 
absolutely largest component of Sq. A right-handed ortho- 
normal system (tj, t2, ts) can then be defined by 

ti=txSo/|txSo|, t2 = So Xti/|So Xtil, t3 = So/|So|. 

(5) 

Now, for a spot located at pixel coordinates x, y in the 
COLSPOT step a scattering vector S can be calculated, 

X = (x - Xq)QxAi + (y - yo)Gyd2 + FA^, 

S = x/A.|x|, (6) 

which intersects the powder plane at unit distance at coordi- 
nates 

S ■ ti/S -13, S ■ iJS ■ t,. (7) 

In the ideal case the scattering vectors mark a set of concentric 
rings: the powder pattern. The centre of the rings should be at 
the tip of the incident beam, which is the image centre. Often 
the centre of the powder rings is instead found to be offset 
from its ideal position, which can be interpreted to result from 
an incorrect value for the origin of the detector coordinate 
system Dq. 

3.5. IDXREF 

For each snapshot this step tries to find a crystal lattice that 
explains the observed diffraction spots by assigning indices to 
them and refines aU parameters. The step returns a hst of the 
successfully processed images and the indexed spots as well as 
the corresponding diffraction parameters. These parameters 
are refined for accurate prediction and integration of all 
reflections that are expected to occur in each snapshot. 

Short real-space lattice vectors are determined by the 
method of Steller et al. (1997). Extraction of a reduced cell, 
Bravais lattice determination and indexing of the observed 
spots is performed as described by Kabsch (1993). The 
computational methods were taken from XDS, with the 
exception of the refinement procedure, which had to be 
adapted for handling stills as well, where the concept of a 
rotation axis has no meaning. This is achieved by using a 
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different refinement target function. Instead of a rotation 
angle about a fixed camera axis, tlie smallest angle is used by 
which a reciprocal-lattice point could reach the Ewald sphere 
by some unrestricted rotation. A closed analytical expression 
for this angle is derived below and it is subsequently shown 
how it is used in the refinement procedure. 

3.5.1. Rotating a reciprocal-lattice point to the Ewald 
sphere. We would like to find the smallest rotation that moves 
a reciprocal-lattice point onto the Ewald sphere. Let pg denote 
any arbitrary reciprocal-lattice vector and Sq denote the 
incident-beam wavevector of length 1/A, {X is the wavelength) 
pointing from the X-ray source towards the crystal. Diffrac- 
tion occurs along the wavevector S when the crystal is rotated 
so that Po is changed into p* on the Ewald sphere satisfying 
the Laue equations 



S = Sq -I- p*, = So 



p*^ = -2So • p* = Po*^ (8) 



The distance vector A = p* — pg depends on the rotation used 
and thus is not unique. 

Extending earlier work (Kabsch, 1988Zj), a unique rotation 
can be found for a given So, pj that yields the shortest distance 
vector A = p* — Po, provided |pol < 2|So| and IpJ • Sgl < 
IPollSol- 

The unique rotation can be obtained by minimizing the 
function 

/(A) = A^ + Aii[(So + Po* + A)^ - S^] + n,[ip*o + \f - pf], 

(9) 

where the two constraints on the solution are enforced by the 
Lagrange multipliers pii, 1x2- At the minimum the gradient of 
/(A) must vanish and the matrix of second derivatives must be 
positive definite. 



2dA 



= A + Aii(So + Po* + A) + + A) 



= 0 



1 -I- Ml + M2 ' 



1 9^/ 

2 9A9A 



= (1 + Aii + M2)l=^ 1 + Aii+M2>0. (10) 



The Lagrange multipliers are adjusted so that the constraints 
are satisfied. At a vanishing gradient of/, the Laue equations 
are satisfied if 



Po = -2So • P = 2 — 



which implies 



*2 

Po 



l + Mi+/^2 2(miS^ - So • pg) 



(11) 



(12) 



Conservation of the length of the reciprocal-lattice point 
implies that 



„*2 

Po 



(Po* - H^Sof 



(p*o - MiSo) Po* 



(1 + Ml + n^f 4(MiSg - So • Po*)' ' 
which leads to a quadratic equation for mi. 



(13) 



l4 - 2^1 



So-Po* I 4(So-pg)'-pg^ 



= 0. 



The solution is 



From 



_ So ■ pg ^ Po*' 

/^l — c2 ^ 0^2 



Po 



S^Po*' - (So ■ Po*)' 



1 + Ml + M2 2(miS^ - So • p*o) 



«2„*2 . 



± 



4P0 



1/2 



'^OPO 4 Po 

Slpf - (So • Po*)' 



(14) 



(15) 



1/2 



(16) 



it can be concluded that the positive sign refers to the 
minimum of A^, while the negative sign corresponds to the 
maximum distance from the Ewald sphere. Using the abbre- 
viations 



A = 



B = 



1 -I- Ml + M2 

Ml 

1 -I- Ml + M2 



2„*2 



4P0 



.S?Po*' - (So • P5)' 



1/2 



= (ASo-Po* + Po*'/2)/S^, (17) 



the closest point p* on the Ewald sphere reachable by a 
rotation of Po is 



p* = Apo* - BSo, A = p* - p*, S = So + p* 



(18) 



Obviously, the vectors A, S, p* all lie in the plane defined by 

Po, So- We conclude that pj reaches the Ewald sphere at p* by 
the shortest path if it is rotated about an axis parallel to the 
plane normal and passing through the origin of reciprocal 
space. The rotation axis is identical to the 'y6-axis' (Schutt & 
Winkler, 1977) defined for each reflection to be perpendicular 
to both the incident-beam and diffracted-beam wavevectors. 

In the case of the rotation method the crystal is forced to 
rotate about a fixed axis which has a component ^ on 
the 'j8-axis'. f is called the reflecting-range expansion factor 
(Kabsch, 2010fe) because the angle to rotate a reflection to the 
Ewald sphere about m2 increases (by 1/f). 

Finally, the angular deviation of the reciprocal-lattice point 
Pq from the Ewald sphere can be defined as the vector 



T = radA/lp^l, rad = 180°/:^. 



(19) 



3.5.2. Initial refinement. For each image, let j enumerate 
the n strong spots with observed centroids X-, Yj, detector 
segment identifying number s, and reflection indices hj, kj, Ij. 
To each ;, we denote the reciprocal-lattice point 
p* = /z^b* -I- k^bl + /^bj, from which a diffracted beam wave- 
vector Sy, an angular deviation from the Ewald sphere r, and 
reflecting-range expansion factors can be determined as 
described in the previous section. With the detector segment 
recording the strong spot at distance F^i, orientation d/ , d,' , dj 
and origin X^^, 1^', the residuals between the calculated and 
observed spot centroids are (segment specified in the labora- 
tory system) 
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A'y = r- + F'Sj . d^i/Sj . 4' - y;. (20) 

The goal of the refinement procedure is to find the detector 
parameters, incident-beam direction and unit-cell basis 
vectors that minimize the target function 

E^w^ tiA'^f + wy tiA^yf + w,t r//[i(^;A^/2)^ + a^]. 

j=l j=l ;=1 

(21) 

The positional part of the target function pushes the calculated 
spot positions towards the observed centroids, while the 
angular part minimizes the angular deviations from the Ewald 
sphere. The variance for the angular part is expected to 
increase with the size of the rotation range. 

The residuals are expanded to first order in the parameter 
changes so that E becomes a quadratic function of these 
changes. Minimization then leads to a system of normal 
equations whose solution is used to update the parameters. 
During refinement, the cell metric obeys constraints imposed 
by the lattice symmetry by choosing an appropriate set of free 
parameters for representing the allowed changes of the basis 
vectors (details not shown). 

The minimization procedure is repeated keeping the same 
set of weights Wx, Wy, w^ until convergence is reached. New 
weights are then determined so that 

w^ = l/±{A>^)\ 

/=i 

Wy^l/±{A>yf, 
7=1 

m'. = 1/E^MI(W2)' + '^m]. (22) 

7=1 

Refinement continues with the new weights until convergence 
is reached again. The whole refinement procedure is termi- 
nated upon convergence of the weights Wx, Wy, w^. 

3.6. INTEGRATE 

Starting with the refined diffraction parameters for the 
successfully indexed data images, this step determines the 
recorded reflection intensities by two-dimensional profile 
fitting and saves the results on file for subsequent processing 
by the CORRECT step. No correction factors are apphed 
except for compensating missing parts in the reflection profile 
owing to overlap with bad pixels or closely neighbouring 
reflections. 

Similar to the integration procedure as implemented in 
XDS (Kabsch, 2010fo), analysis of each image consists of 
determination of the reflection spot size and the mosaicity and 
refinement of the diffraction parameters using the strong 
spots. This is followed by determination of a two-dimensional 
reflection reference profile and integration by profile fitting 
for all expected reflections, including the weak reflections 

3.6.1. Mapping image pixels to the Ewald sphere. As 
described earlier (Kabsch, 19886, 20106) and implemented in 



the program XDS (Kabsch, 2010fl), it is useful to represent the 
intensity distribution in a reflection-specific coordinate system 
{ci, 62, 63} so that all reflections appear as if they had followed 
the shortest path through the Ewald sphere and were recorded 
on the surface of the sphere. This eliminates reflection-specific 
differences in the intensity profile caused by the oblique 
incidence of the diffracted beam on a flat detector segment or 
by crystal rotation around a fixed axis for a rotation data 
image. 

For a reciprocal-lattice point Pg the nearest point S on the 
Ewald sphere is obtained as described above, so that a 
reflection-specific coordinate system can be defined as 

ei = Sx So/is X Sol, 
62 = S X Bj/IS X ej, 

e3 = (S + So)/|S + So|. (23) 

It has its origin at the terminus of the diffracted-beam wave- 
vector S and therefore could move depending on the specific 
path of Pq through the Ewald sphere. The unit vectors Ci and 
62 are tangential to the Ewald sphere, while 63 is perpendicular 
to Ci and p* = S — Sq. 

Diffraction along wavevector S' in the neighbourhood of S 
is recorded at pixel X',Y in the detector plane. This pixel is at 
a distance D from the crystal. With the detector at a distance 
F, an orientation di, d2 and an origin Xq, Yq of the detector 
plane, we have 

D = [(X' - X,f + {¥' - Yof + F^f\ 

S' = [{X' - X,)d, + (Y' - yo)d2 + Fd,]/i\D), (24) 

The corresponding coordinates ei, 82, £3 in the reflection- 
specific system on the Ewald sphere are then 

ei = ei-(S'-S)/|S|, 
e2 = e2-(S'-S)/|S|, 

S3 = e,-(p;-p*)/\v*\. (25) 

In the case of a rotation image, a reciprocal-lattice point pg 
needs a larger angle e3/|ei-m2| to reach the Ewald sphere 
because the movement is restricted by the fixed rotation axis 
m2. The reflecting-range expansion factor f = ei-m2 corrects 
for this effect. It is closely related to the reciprocal Lorentz 
correction factor for rotation images, 

L-i = |m2 • (S X So)|/(|S| • ISol) = |^ sin/(S, So)|. (26) 

The reflection-specific coordinate system defined above sets 
up a one-to-one correspondence between the points of a 
region R of the £162 plane tangential to the Ewald sphere and 
a region R' of the XY plane of the detector segment. The 
Jacobian of the transformation is 

diX',Y') dX'dY' dY'dX'' ^ ' 

From 

ds: 9A.S' 9£, 9A.S' , X . X 

one finds 
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9AS'\ / dXS'\ 



dX'J V ' dY'J 

, /dXS' dXS'\ 
(e, X e,) • I X I . 



dXS'\ / dXS'\ 



(29) 



Using 

dXS' _ di (XS')(X' - Xo) dXS' _ d2 (XS')(Y' - Y^) 



£»2 



one finds 



dXS' dXS' F . 

X = — A.S' 

dX' dY' Z)3 



and tlie Jacobian reduces to tlie simple expression 
F 



J= (XS')-(XS). 



(30) 
(31) 

(32) 



Thus, if the region R' of the XY plane of the detector covers a 
single pixel, the corresponding area R in the £162 plane is the 
mean value of |/| times the pixel area. 



^ = jj" dei ds2 = ff \J\ dX' dY' 

R R' 

~ \J\ff dX'dY' = \J\R'. 

R' 



(33) 



3.6.2. Intensity recorded by a detector pixel. Because of 
crystal mosaicity and beam divergence, the intensity of a 
reflection is smeared around the diffraction maximum. 
Following earlier work (Kabsch, 1988fo, 2010i>), the fraction 
of the total reflection intensity found in the volume element 
dsids2ds3 at ei, £2, £3 is modelled as the product of two 
functions: 

(w(£i, £2, s^dSids2dS'3, = C0i2(£i, £2)d£i^^£2 ■ 0^3(^3)^8^. (34) 

The first function aii2(£i, £2) describes the expected intensity 
profile of the reflections mapped to their specific £1, £2 plane 
tangential to the Ewald sphere. Initially, Gaussians are 
assumed with the parameter ctq modelling the combined 
effects of beam divergence and mosaicity as 



exp(-£ ^/2ag)) exp(-El/2al) 
(In) 



1/2, 



(27r)'''VD 



(35) 



(To is estimated from the observed variance in intensity of 
scattered rays for the strong reflections and provides infor- 
mation on the spot width. In a second step the initial Gaussian 
form for (U12 is replaced by the superposition of the profiles of 
all strong reflections and is used for definition of the final 
integration region included in profile fitting of all reflections 
predicted to occur in the diffraction image. 

The second function, the rocking curve (1)2,(82), models the 
dependency of the reflection intensity on the angular distance 
from the surface of the Ewald sphere as a Gaussian with the 
mosaicity CTm of the crystal as the standard deviation. An 
estimate of CTm is found as described previously (Kabsch, 
20106). If T = rad|A|/|pJ| is the angular deviation of the 
reciprocal-lattice point Po from the Ewald sphere, the fraction 



of intensity recorded on the image, i.e. the partiality of the 
reflection, is 



q = /a>3(£3 - t) d£3 = / 



exp[-(£3 - r) /2ali\ 



d£3. (36) 



The integration extends over the rotation range of the 
spindle during exposure of the image multiplied by the 
reflecting-range expansion factor which corrects for the 
increased path length of the reflection through the Ewald 
sphere when rotated around a fixed axis, i.e. s = ^- A^/2. Using 
the dimensionless variables t = r/l^'^a-M and z = s/l'^'^a^i the 
partiality of the reflection can also be given in terms of the 
error fimction erf: 

z)= f ^""^^'^.^2^'^'^ dz' = [erf(^ + z) - erf(f - z)]/2. 



(37) 



Together with the above expression for the Jacobian /, the 
expected fraction of reflection intensity recorded by pixel 
X, Y in the detector plane is, according to our model, 

s 

If <^i2isu S2) d£i d£2 / a>3(£3 - t) d£3 ~ ft)i2(£i, S2)\J\R'q. 

R -e 

(38) 

Here, ft>i2(£i , £2) and J denote mean values in the region R of 
the £[£2 plane that maps to the region R' of the pixel at X, Y 
in the plane of the detector. 

3.7. CORRECT 

After resolving possible indexing ambiguities, the raw 
intensities of the (reindexed) reflections as obtained from the 
previous step are scaled and corrected by a modified post- 
refinement procedure and then saved in the final reflection 
output file for subsequent structure-solution software 
packages. 

As described below, the new concept of the Ewald offset 
correction factor as a replacement for partiality links raw 
intensities to structure-factor amplitudes, rendering classical 
post-refinement applicable even to still snapshots. 

3.7.1. Correction factors. A reflection intensity /, as 
recorded on a snapshot and returned from the INTEGRATE 
step, is proportional to the 'true' intensity / (the squared 
structure-factor amplitude). 



I=CI, C=CT, C=QLPAO. 



(39) 



The correction factor C consists of a factor C that is not 

affected by possible indexing ambiguities and a factor T that 
accounts for different scale and resolution faU-off between the 
snapshots. 

C is a product of factors modelling the Ewald offset 
correction (Q), Lorentz factor (L), polarization (P), air 
absorption (A) and sensor-thickness correction (O). These 
corrections are functions only of the diffraction parameters of 
the snapshot that recorded the reflection. 

Ewald offset correction. The recorded intensities of the 
reflections of the same image would be directly comparable if 
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they simultaneously obeyed the Laue equations, i.e. r = 0 for 
all reflections. The idea is to correct the intensity by the esti- 
mated decline owing to the angular distance of the reflection 
from the surface of the Ewald sphere. Using the dimensionless 
variables t = T/2^'^crM and z = sII^'^Gm (s = ^- A^2) this leads to 
the definition of the Ewald offset correction 

Q(t, z) = q(t, z)/q(0, z) = \ [erf(f + z) - erf(f - z)]/erf(z), 
Q(f, 0) = lim Q{t, z) = expi-f), (40) 

z-»0 

which is well defined even for still images (z = 0). Q(t, z) can 
be calculated for each reflection from the incident-beam 
wavevector So, the reciprocal basis {b|, b|, h^], the mosaicity 
CTm and the oscillation range A^. 

Lorentz factor. Assuming an infinitely thin Ewald sphere, 
the Laue equations can be satisfied for a reciprocal-lattice 
point by a variation in the wavelength or the direction of the 
incident beam relative to the crystal. For an ideal crystal the 
scattered intensity is sharply concentrated around the direc- 
tion of the diffraction maximum with a solid angle much 
smaller than the finite aperture of a detector pixel. Conse- 
quently, only integrated intensities can be observed: these are 
related to their squared structure factors by the Lorentz 
correction. Explicit forms of the corrections are available (see, 
for example, Zachariasen, 1945) for aU conceivable methods 
of recording sharp diffraction maxima. 

(i) Rotation method. For a fixed wavelength but a variable 
direction of incidence (accomplished by rotation of the crystal 
around a fixed axis and a fixed incident beam), the correction 
factor is 

L = l/|Csin(2e)| = |S||Sol/|m2 • S x Sol, 29 = /(S, Sq). 

(41) 

For the special case that the rotation axis is perpendicular to 
both the incident and diffracted beams, i.e. f = 1, the Lorentz 
correction simplifies to 

L = l/|sin(2e)|. (42) 

Still snapshots can be considered as a limiting case in which 
all reflections move infinitesimally through the Ewald sphere 
along their shortest routes. 

(ii) Laue method. For a variable wavelength but a fixed 
direction of the incident beam, the correction factor is 

L = l/(2sm^e). (43) 

(iii) Powder method. For a fixed wavelength but a variation 
of the incident beam with two degrees of freedom (accom- 
plished by mosaic crystals), the correction factor is 

L = 1/(4 sine). (44) 

Lorentz correction is always a simple function of resolution 
and therefore does not affect the agreement among intensities 
of symmetry-related reflections. 

Polarization. The intensity of scattering from the crystal is 
proportional to the polarization factor 



P=(sinV), (/J=Z(Eo,S), (45) 

where Eo denotes the electrical field vector of the incident 
beam and S the diffracted beam wavevector. If n denotes the 
polarization plane normal and p the probability of finding the 
field vector Eq in this plane, 

P = (sin^ (f) = P sin^ -|- (1 — p) sin^ (P2, 
q>, = Z(So X n, S), 

V, = Z[(So X n) X So, S]. (46) 

For an unpolarized incident beam, the electrical field vector 
Eq is found with equal probability pointing along So x S or 
(So X S) X So, so that 

^1 = n/l, = n/2 - 29, 29 = /(S, Sq). (47) 

This leads to 

P=[l+ cos^(26i)]/2. (48) 

The chosen parametrization allows description of the effect of 
polarization for most data-coUection scenarios (Kahn et al, 
1982). Values for the two parameters n and p are provided by 
the user (not refined). 

Air absorption. The recorded intensity is reduced from its 
'true' value owing to air absorption of the diffracted beam by 
the factor 

A = exp(-/xD), (49) 

where lu. denotes the fraction of intensity loss per millimetre 
and D is the distance (in millimetres) between the crystal and 
the position of the reflection spot on the detector segment, fx is 
a wavelength-dependent input constant (not refined). 

Sensor-thickness correction. The recorded intensity is 
increased owing to the effect of the oblique incidence of the 
diffracted beam on a sensor of finite thickness. Let S denote 
the thickness of the sensor and k the fraction of intensity loss 
per millimetre. If the diffracted beam makes an angle co with 
the segment normal, the probability of a photon penetrating 
the sensor undetected is exp(— /c5/cosa)). Thus, the correction 

_ 1 - exp(-K8/ cos CO) 

1 - CXpi-KS) ^ ^ 

accounts for the oblique incidence of the diffracted beam. 
Values for S and k are provided by the user and are not refined. 

Scaling and temperature factor. Using the parameters g and 
B, the factor 

r = gexp(5|p*|^) (51) 

puts the 'true' intensity for reciprocal-lattice point pg on the 
same scale as the observed one in the data image. The values 
of g and B adjust differences in beam intensity and in crystal 
disorder and volume in a resolution-dependent way. This 
correction factor is obtained by comparison of symmetry- 
equivalent reflections from different images and therefore 
depends on a consistent indexing choice. For resolving 
possible indexing ambiguities, correlations between equiva- 
lent reflection pairs from different snapshots must be 
computed (Brehm & Diederichs, 2014). To manage a 
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potentially large number of snapshots and reflections, an 
efficient solution was implemented in nXDS. 

3.7.2. Efficient calculation of correlation factors. A 

procedure was developed for calculating correlation factors 
in which the number of operations is proportional to the total 
number of recorded reflections of the compared snapshots. 
The procedure consists of two parts: initialization of the data 
structures and calculation of the correlation factor between 
the unique intensity estimates from the compared snapshots. 
Initialization. 

(i) For the given space group and unit-cell parameters all 
possible unique reflections within a given resolution range are 
generated. A prime number is determined that is slightly 
larger than twice the number of possible unique reflections 
and is used to allocate space for a hash table of this size. 

(ii) For each possible unique reflection a positive 64-bit 
integer is constructed from its unique indices and assigned a 
definite address in the table using the technique of hash-key 
transformations with quadratic probing to resolve key colli- 
sions (Wirth, 1976). 

(iii) For each reflection from the snapshots the unique 
indices are determined and coded by their hash-table address, 
which is saved as an auxiliary reflection attribute. Thus, two 
reflections are symmetry-related only if they have identical 
hash-table addresses. 

(iv) Four auxiliary arrays of the size of the hash table are 
aUocated: two for each data set to be compared. They are 
needed for calculating imique intensities and their variances 
for the reflections of the two data sets. 

Correlation. 

(i) Unique intensities and variances are estimated from 
symmetry-related reflections of the first snapshot by updating 
the contents of their associated hash addresses in the first two 
auxiliary arrays. 

(ii) For the intensity data from the second snapshot the 
procedure is repeated, this time updating the second two 
auxiliary arrays only if there is a positive entry from the first 
snapshot at the same hash address in the first two auxiliary 
arrays. 

(iii) Pairs of corresponding unique reflection intensities are 
obtained easily by scanning the second two auxiliary arrays for 
a positive contents. 

Thus, the total number of operations for calculating the 
correlation factor between one snapshot and all others is only 
proportional to the total number of recorded reflections of the 
snapshots. 

3.7.3. Indexing alternatives. For a given space group and 
unit-cell parameters, a reciprocal cell basis in some reference 
orientation (Kabsch, 1988a) is first defined and serves as a 
reference basis. As each image is indexed and integrated 
independently, it often happens that the reflection indices 
refer to different (reduced) cells. The problem is to find for 
each image the set of possible reindexing transformations that 
allow the reflections to be described in terms of a rotated 
version of the reference basis. 

This set of reindexing transformations is determined for 
each image using the following procedure. From the reciprocal 



basis vectors used for indexing the reflections of the image in 
the INTEGRATE step, a reduced ceU and its reciprocal cell 
are determined. The three reciprocal vectors thus obtained are 
considered as reflections that need to be indexed with respect 
to a rotated version of the reciprocal reference basis. Possible 
indices of the reduced-cell reciprocal vectors with respect to 
the reference basis are found by simply testing all possibilities 
involving indices absolutely smaUer than 4. For each assign- 
ment of indices a residual error for the best superposition with 
the reference basis is determined (Kabsch, 1976, 1978) and 
is used as a measure of the quality of the indexing. Index 
assignments related by symmetry of the reference cell are 
omitted, so that a list of symmetry-independent interpreta- 
tions remains. This list is sorted by increasing r.m.s. of the 
superposition with the reciprocal reference cell. Entries in the 
list with an r.m.s. larger than some multiple of that of the first 
item are omitted. 

The list may be empty if no reasonable interpretation is 
found for the basis vectors of the image. In this case the image 
is omitted from further calculations. 

3.7.4. Resolving the indexing ambiguity. Ideally, only one 
symmetry-independent solution remains, identified using 
the above procedure solely by geometrical considerations. 
However, for merohedral and pseudo-merohedral crystals, 
where the lattice symmetry is higher than the symmetry of the 
point group, more than one choice for the reindexing trans- 
formation exists. In nXDS the indexing ambiguity can be 
resolved by using one of two methods. Both methods rely 
on correlation factors between intensities that have been 
corrected by C for various effects as described above. Note 
that the scaling corrections T are not needed here. 

Comparison with a reference. For each snapshot aU possible 
reindexing transformations are tested and the indexing choice 
yielding the largest correlation factor with the given reference 
data set is selected. Here, symmetry-equivalent reflection 
intensities from the snapshot as well as from the reference 
data are merged separately prior to calculation of the corre- 
lation factor. Moreover, an initial scaling factor is determined 
at little additional computational effort that puts the inten- 
sities from each snapshot on the level of the reference. 

Selective breeding. If no external reference data set is 
avaflable, a solution is found by a method that is reminiscent 
of the technique of selective breeding. The method initiates a 
cyclic procedure with some arbitrary indexing transformation 
from the list of possibihties assumed by each snapshot. For 
each cycle the following steps are carried out. 

(i) For each snapshot all of its possible indexing choices are 
tested in succession, with the reindexed reflection intensities 
treated as a hypothetical reference data set. The mean value of 
the correlation factors with all other snapshots is calculated, 
and the running number of the reindexing choice yielding the 
largest correlation factor is saved. If several choices result in 
the same value for the correlation maximum, the first one in 
the list is selected. 

(ii) At the end of the cycle the list of optimal indexing 
choices just determined is used and replaces the previous 
selection. For some snapshots the new running number of the 
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reindexing choice may differ from that of the previous cycle. 
If none of the snapshots needs to be assigned to a different 
indexing choice than before, the cyclic procedure terminates. 

This procedure usually terminates within ten cycles. The 
moderate amount of storage needed is proportional to the 
number of snapshots and to the size of the hash table. In 
addition one array is needed that keeps the hash-table address 
of the unique indices for each reflection of the whole set of 
images. This speeds up the computation of correlation factors 
between all pairs of snapshots. Within each cycle these 
computations can be carried out very efficiently and in parallel 
by a team of processors. 

3.7.5. Scaling. We assume that all reflections have been 
consistently indexed. The reflection intensities of each image 
are still on different scales owing to differences in the intensity 
of the incident beam or irradiated crystal volume. Determi- 
nation of the scale factors by the method described here is 
based on intensity estimates for the unique reflections occur- 
ring on each image, 



(52) 



where v enumerates the symmetry-equivalent reflections and 
and the recorded intensities on the same image and their 
estimated standard deviations, respectively. C„ denotes the 
correction factors as described above. A scaling correction 
factor for each image is determined by least-squares mini- 
mization using common unique reflections with a positive 
intensity that occur in more than one image. 

Let h enumerate the different unique indices of the 
reflections involved in scaling and I enumerate the «; images 
from which the reflections come. Let ;' enumerate the n unique 
reflections included. To each /, we denote intensity /, > 0, 
variance ct?, unique reflection index hj and image /,. The goal 
of the scaling procedure is to find factors g/ > 0 and mean 
intensities //; > 0 by minimizing the target function (Hamilton 
et al, 1965) 



(55) 



Therefore, unique directional minimizers can be defined by 
equating the gradients to zero: 



UG) = E w^iJ^ - G^)KhJ E ^hn^- (56) 

7=1 7=1 

The target function is monotonically reduced by a cyclic 
procedure of alternating minimizations along the / and G 
directions. The procedure is initiated at G; = 0, //,(G = 0). In 
each following cycle new scaling factors are found as 



G; - G, = c(G, - G,), 



(57) 



with the step size c chosen to maximize the reduction of the 
target function. Expansion of the quadratic target function at 
the point G, /(G) yields 

vI'[G, /(G)] = *[G, /(G)] + E w/G, - G, )' 

vI'[G', /(G)] = ^\G, /(G)] + E w/G' - G^f (58) 

;=1 

and the resulting reduction in the target function is 
*[G, /(G)] - *[G', /(G)] = [!-(!- cf\ t wm - G.f. 

(59) 

Moving from G to G' changes the mean logarithmic intensities 
by 



V^(g. = E ^r-^ 



(53) 



To guarantee success of the solution method for a large set of 
snapshots and possibly large variations in their scahng factors, 
the target function is modified by using logarithms. 



/,(G') - /,(G) = -E M'/(Gi. - G,.)^./ E H'y^. 

j=i j=i 

= c[/,(G)-/,(G)]. 
Expansion of ^ at point G', /(G') yields 



(60) 



*[G', /(G)] = *[G', /(G')] + E H-yK (G) - J, (G')f. (61) 

i=i 

Using the abbreviations 



<I'(G,/) = E'^M 

7=1 



(54) 



defining Jj — In/y, Wj — Ij/Sj, Gi = Ing; and Jf, - In 4. This 
target function is quadratic, with a constant matrix of second 
derivatives and positive diagonal elements: 



a = E H'/G,. - G,)\ b = E >v,[/,.(G) - 4.(G)]^ (62) 
7=1 7=1 

the reduction in the target fimction at completion of one cycle 
is 

*[G, /(G)] - ^[G', /(G')] = 2ac -(a- b)c^. (63) 
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A finite value for the reduction requires a > b, which leads 
to an optimal step size c = a/{a — b), so that the target function 
is reduced by *[G, /(G)] - ^[G', J(G')] = ac. 
For the gradient, we have 



d^[G, /(G)] 
dGi 



J2^jiGi - G'i)8ii 



9*I'[G,/(G)] 



= 0. 



(64) 



Since the target function is non-negative, the procedure 
generates a bounded monotonically decreasing sequence 
converging to the minimum of the target function at vanishing 
gradient. As shown above, this also implies convergence of the 
sequence of scaling factors. 

Obviously, the solution thus obtained is not unique because 
the target function does not change its value if one adds 
an arbitrary constant to the logarithmic scaling factor G; 
and subtracts an appropriate constant vector from /;,. This 
amounts to just changing the common scale of all images in the 
original problem, which is of no importance as we are only 
interested in the relative scale. 

It may happen that several sets of images exist that are not 
connected by common measurements. In this case the target 
function could be thought of consisting of a sum of the same 
type of target functions, one for each unconnected subset of 
images. Now there is an arbitrary common factor for each 
subset. Apparently, the presence of arbitrary common factors 
for each subset of images does not prevent convergence of the 
cyclic solution procedure described here. 

3.7.6. Post-refinement. As mentioned above, the diffraction 
parameters of each image are refined in the IDXREF and 
INTEGRATE steps to minimize deviations between the 
observed and the predicted locations of the strong spots and 
to minimize their angular distance from the Ewald sphere. The 
angular part of the target function takes care of the fact that 
reflections can be visible only if they are close to the Ewald 
sphere. In addition, the distribution of r angles thus obtained 
provides an initial guess for the crystal mosaicity ctm- 
However, the angular part of the initial refinement target 
caimot account for the fact that very strong reflections can still 
be observed even if they are farther away from the Ewald 
sphere than the weaker reflections. This leads to a systematic 
bias in the initial parameter refinement so that strong reflec- 
tions will be predicted to be closer to the Ewald sphere than 
they really are. 

These deficiencies can be overcome when all images have 
been processed and intensity estimates for the recorded 
reflections are included in the refinement, which explains why 
this approach has been dubbed 'post-refinement'. The original 
idea (Schutt & Winkler, 1977; Rossmann et al, 1979; Harrison 
et al, 1985; Rossmann, 1985) is extended here to handle still 
snapshots as well when fully recorded, measured reflections 
are not available for comparison and the notion of 'partiality' 
loses its meaning. Its role is assumed by the Ewald offset 
correction Q defined above that is applicable for rotation 
images as well as stills. The 'post-refinement' variant 
implemented here considers the possible unique reflection 



intensities as free parameters that are to be refined along with 
the diffraction parameters of each snapshot (Bolotovsky et al., 
1998). 

The goal of the refinement procedure is the minimization of 
the target function 



E^wxT ^ii^'x) +^yJ: wy(AV)' + YXih - %)/<3}r . 

;=1 i=\ j=l 

(65) 

Here again, / enumerates the n recorded reflections from all 
snapshots, A^-, A'x the residuals between the calculated and 
observed spot centroids (see §3.5.2) and Ij, ct, the recorded raw 
intensity and its standard deviation. If a spot; is strong enough 
so that a centroid could be determined then Wj = l, otherwise 
wj = 0. 

Each spot / is associated with a reciprocal-lattice point pj, 
close to the Ewald sphere. For each observation a correction 
factor Cj can be computed from the diffraction parameters 
that relates the recorded intensity Ij to a unique reference 
intensity 4, where hj denote the unique indices of the 
reciprocal-lattice points pjy. 

The target function E depends on private parameters for 
each snapshot and global parameters and constants. 

(i) Private parameters. 

So, the incident-beam wavevector. 

\b\, b|, bj}, the reciprocal cell basis vectors. 

g, the scaling factor for intensities. 

B, the isotropic temperature factor. 

Oyi, the mosaicity. 

detector parameters. 

(ii) Global parameters. 

Ih, the squared structure-factor amplitudes for the possible 
unique reflections h (up to some global constant irrelevant in 
this context). 

(iii) Global constants. 

a and p, the polarization plane normal and the degree of 
polarization. 

/X, the fraction of intensity loss per millimetre in air. 
S, the thickness of the detector sensor. 
K, the fraction of intensity loss per millimetre in the sensor, 
and A^, the rotation axis and oscillation range (A^ = 0 
for 'stills'). 

Each refinement round starts with the determination of the 
unique reflection intensities //,, keeping the current parameter 
values constant. Minimization of the target function yields 



h = i:hHijq/af/j:s,>,cj/af. 

7=1 7=1 

The weights are then calculated as 

H'7 = l/D(/;-C/J/a/ 



(66) 



(67) 
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Table 1 

Comparison of data processing with XDS and nXDS. 



-IVCoUIUllUII 


15-2 


15-6 


6-4 


4—2.1 


2.1-2 


XDS 












Reflections 


356854 


12414 


31981 


264002 


48457 


Multiplicity 


7.8 


7.7 


7.8 


7.8 


7.7 




38.6 


90.9 


83.3 


35.6 


12.2 


CC„2 (%) 


100.0 


100.0 


100.0 


100.0 


99.2 


CQ,„„ (%) 


76 


98 


96 


71 


34 


nXDS 












Reflections 


3409453 


56009 


231474 


2610196 


511774 


Multiplicity 


74.6 


37.9 


57.4 


76.8 


82.7 


(//cr(/)) 


16.9 


24.9 


30.2 


16.8 


6.7 


CC„2 (%) 


99.9 


99.8 


99.9 


99.8 


97.2 


CCano (%) 


41 


84 


72 


38 


18 


Table 2 












Wall-clock times (s) for processing with XDS and nXDS. 




Step 


XDS 


nXDS 




Comments 




XYCORR 


1 


1 








INIT 


172 


124 








COLSPOT 


82 


1864 




I/O limited 




POWDER 




7 








IDXREP 


3 


1206 








INTEGRATE 


2022 


1704 








CORRECT 


24 


43 




Using a reference 






808 




Using selective breeding 



and are kept throughout the refinement round. The whole 
procedure is terminated upon convergence of the weights Wx, 

Wy, Wj. 

During a refinement round the diffraction parameters for 
each snapshot are corrected iteratively to minimize the target 
function until convergence is reached. The residuals are 
expanded to first order in the parameter changes so that E 
becomes a quadratic fimction of these changes. Minimization 
then leads to a system of normal equations whose solution is 
used for updating the parameters. The gradients of Aj^-, Ay 
and Cj are computed from analytic expressions (not shown). 

Fortunately, these calculations are independent for each 
snapshot and can be performed in parallel by a team of 
processors. Moreover, the memory requirements are almost 
negligible even when the refinement of detector parameters is 
included. 



4. Example of data processing with nXDS 

As an example to demonstrate the quality of data processed 
by nXDS in comparison to conventional data reduction by 
XDS, 20 000 consecutive rotation images were collected at 
100 K from a crystal of a selenomethionine-labelled double 
mutant of the RNA-processing factor SCAF8 (Becker et al, 
2008). Each image covers a rotation range A^ of 0.02° and 
is treated by nXDS as a snapshot taken from a randomly 
oriented crystal. The images were collected on beamline 
XIOSA at the Swiss Light Source, Vilhgen, Switzerland at a 
wavelength of 0.9779 A, slightly above the Se K edge. The 
images were recorded by a PILATUS 6M pixel detector 
(Dectris AG, Baden, Switzerland) located at 300 mm distance. 



Table 3 

Consistent indexing by selective breeding. 

Generation Replaced misfits 

1 10045 

2 9740 

3 6380 

4 114 

5 0 



The crystal has PAj, space-group symmetry, which is lower than 
the 422 lattice symmetry, implying a twofold indexing ambi- 
guity 

The processing results are summarized in Table 1. The 
upper part refers to the evaluation of the images by XDS as 
conventional rotation data. The lower part shows the corre- 
sponding quantities as obtained from nXDS. Here, reflections 
were only included if their Ewald offset correction was larger 
than 0.7. A total of 356 854 reflections in the resolution range 
15-2 A were integrated by XDS, so that for each unique 
reflection almost eight symmetry-related reflections are 
available. Because contributions to each reflection are also 
recorded by adjacent images, it is not surprising that nXDS 
found almost ten times more reflections, which is consistent 
with the correspondingly higher multiphcity of observations. 

After merging symmetry-related reflection intensities one 
might expect that the mean signal-to-noise ratio (Her (I)) 
would come out about the same regardless of whether the 
images were processed by XDS or nXDS. This is not the case: 
the mean signal-to-noise ratio is higher by a factor of almost 
three for the results from XDS. The lower accuracy of nXDS 
presumably results from two-dimensional instead of three- 
dimensional profile fitting and the lack of other corrections 
not carried out yet by this version of nXDS. The nearly perfect 
correlation factors CC1/2 (Karplus & Diederichs, 2012) 
between intensities of symmetry-related reflections obtained 
from processing by both programs reflect the excellent quality 
of the data images. The presence of anomalous scatterers 
is clearly indicated in both processing results by the highly 
significant value for the anomalous correlation CCgno- Finally, 
the reflection intensities obtained from both programs are in 
excellent agreement, showing a correlation coefficient of 98%. 

Data processing was carried out by a 12-core machine with 
16 GB memory running under Linux. Making use of the 
hyperthreading capability, up to 24 threads were employed for 
processing the images stored on a local disk. Elapsed wall- 
clock times for each step are listed in Table 2. COLSPOT uses 
a very fast spot-finding procedure but spends most of its time 
waiting for the next image to arrive. On average only three out 
of 24 threads were active. COLSPOT is a time-consuming step 
in nXDS because each image had to be analyzed for diffrac- 
tion spots since the knowledge that the images comprise a 
rotation data set was not used. In contrast, XDS only requires 
spots from a small fraction of the images for recognizing the 
crystal lattice and for accurate refinement of the diffraction 
parameters. For the same reason, the IDXREF step in nXDS 
takes much longer than in XDS. In the INTEGRATE step 
nXDS is somewhat faster than XDS because of the reduced 
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Table 4 

Phasing statistics. 





XDS 


nXDS 


SHELXD 






Correct solutions for 100 trials 


100 


96 


PATFOM 


21.41 


14.38 


CQ,„ (%) 


56.07 


38.31 


CC„„ak (%) 


34.51 


21.99 


SHELXE 






CQ„, (%) 


71.53 


62.25 


Heavy-atom peaks (map units a) 






Strongest (a) 


44 


34 


Weakest (cr) 


19 


13 


Average (cr) 


36 


27 


Highest noise (o) 


10 


8 



overhead in control when only single images are involved. 

Compared with XDS, the CORRECT step of nXDS takes 
longer because of the much larger number of reflections. 
Furthermore, additional computations are required by the 
selective breeding procedure when no reference data set is 
available. According to Table 2, the breeding procedure 
required about 14 min to resolve the twofold indexing ambi- 
guities for 20 000 snapshots and a total of 3.4 million reflec- 
tions. 

Details of the procedure are shown in Table 3 as the number 
of misfitting snapshots. In the first two generations both 
indexing alternatives are nearly randomly distributed. A 
small fluctuation towards one choice builds up in the third 
generation and quickly dominates the population. After five 
generations a homogeneous population is obtained. 

Phasing of single anomalous diffraction data was performed 
for the XDS and nXDS processed data sets using the 
SHELXCIDIE program suite (Sheldrick, 2010). The results 
are summarized in Table 4. In brief, the marker-atom structure 
factors were estimated from pre-merged data using SHELXC. 
Subsequently, the selenium substructure was determined in 
a search of 100 trials for ten putative sites while applying a 
high-resolution cutoff at 2.5 A. For both data sets, SHELXD 
identified 14 positions, of which seven showed significantly 
higher occupancy when compared with the less significant 
positions. This is in good agreement with the expected eight 
selenium positions (Becker et al, 2008). Phases were further 
improved by density modification using SHELXE; eight sites 
were refined to significant occupancy and the phases obtained 
resulted in excellent electron-density maps with high pseudo- 
free correlation coefficients CCfree (Table 4) and the correct 
enantiomorphic setting. Peak heights at the eight heavy-atom 
sites are highly significant and are weU above the largest noise 
peak. Although phasing for both data sets was unambiguous, 
the data set processed with nXDS showed slightly lower 
correlation coefficients, Patterson figures of merit (PATFOM) 
and heavy-atom peak heights. 

5. Conclusion 

This study describes a new approach for processing a large set 
of snapshots from randomly oriented crystals that does not 



rely on the Monte Carlo method of integration. Instead, the 
concept of the Ewald offset correction factor was devised to 
overcome difficulties arising from the use of partiality for 
modelling reflection intensities recorded by snapshots. The 
new approach has been implemented in the program nXDS 
that has borrowed many ideas and routines from the rotation 
data-processing package XDS as well as from the powerful 
post-refinement technique that has been in widespread use for 
several decades. 

The implemented Ewald offset correction rehes on a 
Gaussian model for the rocking curve, assuming a sufficiently 
large crystal whose shape transform can be ignored. In this 
case the exact functional form of the curve is not critical for 
reflections sufficiently close to the Ewald sphere. In the test 
data set, a reflection was only included if its Ewald correction 
factor was larger than 0.7. 

As shown for the test case with fine-shced rotation images 
of excellent quality, nXDS delivers results almost approaching 
those obtained by XDS and is able to retrieve the anomalous 
signal from a selenomethionine-labelled protein crystal. The 
source of the lower accuracy of nXDS is not yet clear. It could 
result from two-dimensional instead of three-dimensional 
profile fitting and the omission of information from weak 
contributions to reflections further away from the Ewald 
sphere that are used only by XDS. In fact, a small improve- 
ment in overall data quahty by 0.9% in {Ila{I)) and 2.9% in 
CCano was observed upon the inclusion of weaker contribu- 
tions when the minimum required Ewald offset correction was 
lowered from 0.8 to 0.7. 

So far, no 'real' FEL data have been processed by nXDS. 
These data typically vary for each snapshot in wavelength, 
bandwidth and crystal parameters. Although nXDS allows 
some of these to change for each snapshot, program modifi- 
cations are likely to become necessary when the incident- 
beam bandwidth can no longer be substituted by a mean 
wavelength or if the shape transform of the crystals cannot be 
ignored. Presently, nXDS can only accept images that XDS 
can read. Work is in progress to adapt the package to also 
handle the detectors used at FEL beamlines and to make 
nXDS and its documentation available from the internet. 

I thank Anton Meinhart for providing the crystal and 
analysis of the anomalous phasing power of the XDS and 
nXDS processed data, lime Schlichting for measuring the test 
data sets, discussions and support of this work, and Bruce 
Doak for reading the manuscript. I am grateful to Kay 
Diederichs and Michael Junk for inspiring communications 
about resolving the indexing ambiguity and scaling. 
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